CL-1806-conversion-CIP 3-5-04.ST25
SEQUENCE LISTING 

<110> E.I. DuPont de Nemours, & Company 
Yadav, Narendra S. 
Yang, Jianjun Gene

<120> Intein-Mediated Protein Splicing

<130> CL1806 US CIP 

<150> US 60/354395 
<151> 2002-02-04 

<160> 78 

<170> PatentIn version 3.2

<210> 1 
<211> 123 
<212> PRT 

<213> Synechocystis sp. PCC6803 
<400> 1 

Cys Leu Ser Phe Gly Thr Glu Ile Leu Thr Val Glu Tyr Gly Pro Leu
1 5 10 15

Pro Ile Gly Lys Ile Val Ser Glu Glu Ile Asn Cys Ser Val Tyr Ser
20 25 30

Val Asp Pro Glu Gly Arg Val Tyr Thr Gln Ala Ile Ala Gln Trp His
35 40 45

Asp Arg Gly Glu Gln Glu Val Leu Glu Tyr Glu Leu Glu Asp Gly Ser
50 55 60

Val Ile Arg Ala Thr Ser Asp His Arg Phe Leu Thr Thr Asp Tyr Gln
65 70 75 80

Leu Leu Ala Ile Glu Glu Ile Phe Ala Arg Gln Leu Asp Leu Leu Thr
85 90 95

Leu Glu Asn Ile Lys Gln Thr Glu Glu Ala Leu Asp Asn His Arg Leu
100 105 110

Pro Phe Pro Leu Leu Asp Ala Gly Thr Ile Lys
115 120

<210> 2 
<211> 37 
<212> PRT 

<213> Synechocystis sp. PCC6803 
<400> 2 

Met Val Lys Val Ile Gly Arg Arg Ser Leu Gly Val Gln Arg Ile Phe
1 5 10 15

Asp Ile Gly Leu Pro Gln Asp His Asn Phe Leu Leu Ala Asn Gly Ala
20 25 30

Ile Ala Ala Asn Cys
35

<210> 3 
<211> 75 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 3 

tgcctttctt tcggaactga gatccttacc gttgagtacg gaccacttcc tattggtaag 60 
atcgtttctg aggaa 75 

<210> 4 

<211> 75 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 4 

attaactgct cagtgtactc tgttgatcca gaaggaagag tttacactca ggctatcgca 60 
caatggcacg atagg 75 

<210> 5 

<211> 75 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 5 

ggtgaacaag aggttctcga gtacgagctt gaagatggat ccgttattcg tgctacctct 60 
gaccatagat tcttg 75 

<210> 6

<211> 75 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 6 

actacagatt atcagcttct cgctatcgag gaaatctttg ctaggcaact tgatctcctt 60 
actttggaga acatc 75 

<210> 7 
<211> 69 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 7 

aagcagacag aagaggctct tgacaaccac agacttccat tccctttgct cgatgctgga 60
accatcaag 69



<210> 8 

<211> 30 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 8 

cttgatggtt ccagcatcga gcaaagggaa 30



<210> 9 

<211> 75 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 9 

tggaagtctg tggttgtcaa gagcctcttc tgtctgcttg atgttctcca aagtaaggag 60
atcaagttgc ctagc 75



<210> 10 

<211> 75 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 10 

aaagatttcc tcgatagcga gaagctgata atctgtagtc aagaatctat ggtcagaggt 60
agcacgaata acgga 75



<210> 11 

<211> 75 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 11 

tccatcttca agctcgtact cgagaacctc ttgttcaccc ctatcgtgcc attgtgcgat 60
agcctgagtg taaac 75



<210> 12 

<211> 75 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 12 

tcttccttct ggatcaacag agtacactga gcagttaatt tcctcagaaa cgatcttacc 60 
aataggaagt ggtcc 75 



<210> 13

<211> 39

<212> DNA

<213> Artificial Sequence
<220>

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons

<400> 13

gtactcaacg gtaaggatct cagttccgaa agaaaggca 39

<210> 14

<211> 75

<212> DNA

<213> Artificial Sequence
<220>

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons

<400> 14

atggttaagg tgattggaag acgttctctt ggtgttcaaa ggatcttcga tatcggattg 60
ccacaagacc acaac 75

<210> 15

<211> 36

<212> DNA

<213> Artificial Sequence
<220>

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons

<400> 15

tttcttctcg ctaatggtgc catcgctgcc aattgc 36

<210> 16

<211> 75

<212> DNA

<213> Artificial Sequence
<220>

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons

<400> 16

gcaattggca gcgatggcac cattagcgag aagaaagttg tggtcttgtg gcaatccgat 60
atcgaagatc ctttg 75

<210> 17 

<211> 36 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 17 

aacaccaaga gaacgtcttc caatcacctt aaccat 36 

<210> 18 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 18 

tgcctttctt tcggaactga g 21 

<210> 19 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 19 

tcacttgatg gttccagcat cgag 24 

<210> 20 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons 

<400> 20 

ccatggttaa ggtgattgga agac 24 

<210> 21 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803, to contain plant
preferred codons

<400> 21

gcaattggca gcgatggcac c 21



<210> 22 

<211> 369 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803 SspE, to contain plant
preferred codons

<220>

<221> CDS

<222> (1)..(369)

<223> Modified from Synechocystis sp. PCC6803 SspE, to contain plant
preferred codons



<400> 22 

tgc ctt tct ttc gga act gag atc ctt acc gtt gag tac gga cca ctt 48
Cys Leu Ser Phe Gly Thr Glu Ile Leu Thr Val Glu Tyr Gly Pro Leu
1 5 10 15

cct att ggt aag atc gtt tct gag gaa att aac tgc tca gtg tac tct 96
Pro Ile Gly Lys Ile Val Ser Glu Glu Ile Asn Cys Ser Val Tyr Ser
20 25 30

gtt gat cca gaa gga aga gtt tac act cag gct atc gca caa tgg cac 144
Val Asp Pro Glu Gly Arg Val Tyr Thr Gln Ala Ile Ala Gln Trp His
35 40 45

gat agg ggt gaa caa gag gtt ctc gag tac gag ctt gaa gat gga tcc 192
Asp Arg Gly Glu Gln Glu Val Leu Glu Tyr Glu Leu Glu Asp Gly Ser
50 55 60

gtt att cgt gct acc tct gac cat aga ttc ttg act aca gat tat cag 240
Val Ile Arg Ala Thr Ser Asp His Arg Phe Leu Thr Thr Asp Tyr Gln
65 70 75 80

ctt ctc gct atc gag gaa att ttt gct agg caa ctt gat ctc ctt act 288
Leu Leu Ala Ile Glu Glu Ile Phe Ala Arg Gln Leu Asp Leu Leu Thr
85 90 95

ttg gag aac att aag cag aca gaa gag gct ctt gac aac cac aga ctt 336
Leu Glu Asn Ile Lys Gln Thr Glu Glu Ala Leu Asp Asn His Arg Leu
100 105 110

cca ttc cct ttg ctc gat gct gga acc atc aag 369
Pro Phe Pro Leu Leu Asp Ala Gly Thr Ile Lys
115 120



<210> 23 
<211> 123 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Synthetic Construct 
<400> 23 

Cys Leu Ser Phe Gly Thr Glu Ile Leu Thr Val Glu Tyr Gly Pro Leu
1 5 10 15

Pro Ile Gly Lys Ile Val Ser Glu Glu Ile Asn Cys Ser Val Tyr Ser
20 25 30

Val Asp Pro Glu Gly Arg Val Tyr Thr Gln Ala Ile Ala Gln Trp His
35 40 45

Asp Arg Gly Glu Gln Glu Val Leu Glu Tyr Glu Leu Glu Asp Gly Ser
50 55 60

Val Ile Arg Ala Thr Ser Asp His Arg Phe Leu Thr Thr Asp Tyr Gln
65 70 75 80

Leu Leu Ala Ile Glu Glu Ile Phe Ala Arg Gln Leu Asp Leu Leu Thr
85 90 95

Leu Glu Asn Ile Lys Gln Thr Glu Glu Ala Leu Asp Asn His Arg Leu
100 105 110

Pro Phe Pro Leu Leu Asp Ala Gly Thr Ile Lys
115 120

<210> 24 
<211> 111 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Modified from Synechocystis sp. PCC6803 SspE, to contain plant
preferred codons

<220>

<221> CDS

<222> (1)..(111)

<223> Modified from Synechocystis sp. PCC6803 SspE, to contain plant
preferred codons

<400> 24 

atg gtt aag gtg att gga aga cgt tct ctt ggt gtt caa agg atc ttc 48
Met Val Lys Val Ile Gly Arg Arg Ser Leu Gly Val Gln Arg Ile Phe
1 5 10 15

gat atc gga ttg cca caa gac cac aac ttt ctt ctc gct aat ggt gcc 96
Asp Ile Gly Leu Pro Gln Asp His Asn Phe Leu Leu Ala Asn Gly Ala
20 25 30

atc gct gca aat tgc 111
Ile Ala Ala Asn Cys
35

<210> 25 
<211> 37 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Synthetic Construct 
<400> 25 

Met Val Lys Val Ile Gly Arg Arg Ser Leu Gly Val Gln Arg Ile Phe
1 5 10 15

Asp Ile Gly Leu Pro Gln Asp His Asn Phe Leu Leu Ala Asn Gly Ala
20 25 30

Ile Ala Ala Asn Cys
35



<210> 26 

<211> 48 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer to introduce CDS for peptide MAHHHHHH at the N-terminus of 
GUS 

<400> 26 

atggctcatc atcatcatca tcatgtacgt cctgtagaaa ccccaacc 48 



<210> 27 

<211> 27 

<212> DNA 

<213> Artificial Sequence
<220> 

<223> Primer to introduce a BamHI site after the stop codon of GUS

<400> 27 

ggatccttgt ttgcctccct gctgcgg 27 



<210> 28 

<211> 618 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Modified GUS protein, with 6x His tags on the N-terminus and
C-terminus

<220>

<221> PEPTIDE
<222> (1)..(618)

<223> Modified GUS protein, with 6x His tags on the N-terminus and
C-terminus

<400> 28 

Met Ala His His His His His His Val Arg Pro Val Glu Thr Pro Thr
1 5 10 15

Arg Glu Ile Lys Lys Leu Asp Gly Leu Trp Ala Phe Ser Leu Asp Arg
20 25 30

Glu Asn Cys Gly Ile Asp Gln Arg Trp Trp Glu Ser Ala Leu Gln Glu
35 40 45

Ser Arg Ala Ile Ala Val Pro Gly Ser Phe Asn Asp Gln Phe Ala Asp
50 55 60

Ala Asp Ile Arg Asn Tyr Ala Gly Asn Val Trp Tyr Gln Arg Glu Val
65 70 75 80

Phe Ile Pro Lys Gly Trp Ala Gly Gln Arg Ile Val Leu Arg Phe Asp
85 90 95

Ala Val Thr His Tyr Gly Lys Val Trp Val Asn Asn Gln Glu Val Met
100 105 110

Glu His Gln Gly Gly Tyr Thr Pro Phe Glu Ala Asp Val Thr Pro Tyr
115 120 125

Val Ile Ala Gly Lys Ser Val Arg Ile Thr Val Cys Val Asn Asn Glu
130 135 140

Leu Asn Trp Gln Thr Ile Pro Pro Gly Met Val Ile Thr Asp Glu Asn
145 150 155 160

Gly Lys Lys Lys Gln Ser Tyr Phe His Asp Phe Phe Asn Tyr Ala Gly
165 170 175

Ile His Arg Ser Val Met Leu Tyr Thr Thr Pro Asn Thr Trp Val Asp
180 185 190

Asp Ile Thr Val Val Thr His Val Ala Gln Asp Cys Asn His Ala Ser
195 200 205

Val Asp Trp Gln Val Val Ala Asn Gly Asp Val Ser Val Glu Leu Arg
210 215 220

Asp Ala Asp Gln Gln Val Val Ala Thr Gly Gln Gly Thr Ser Gly Thr
225 230 235 240

Leu Gln Val Val Asn Pro His Leu Trp Gln Pro Gly Glu Gly Tyr Leu
245 250 255

Tyr Glu Leu Cys Val Thr Ala Lys Ser Gln Thr Glu Cys Asp Ile Tyr
260 265 270

Pro Leu Arg Val Gly Ile Arg Ser Val Ala Val Lys Gly Glu Gln Phe
275 280 285

Leu Ile Asn His Lys Pro Phe Tyr Phe Thr Gly Phe Gly Arg His Glu
290 295 300

Asp Ala Asp Leu Arg Gly Lys Gly Phe Asp Asn Val Leu Met Val His
305 310 315 320

Asp His Ala Leu Met Asp Trp Ile Gly Ala Asn Ser Tyr Arg Thr Ser
325 330 335

His Tyr Pro Tyr Ala Glu Glu Met Leu Asp Trp Ala Asp Glu His Gly
340 345 350

Ile Val Val Ile Asp Glu Thr Ala Ala Val Gly Phe Asn Leu Ser Leu
355 360 365

Gly Ile Gly Phe Glu Ala Gly Asn Lys Pro Lys Glu Leu Tyr Ser Glu
370 375 380

Glu Ala Val Asn Gly Glu Thr Gln Gln Ala His Leu Gln Ala Ile Lys
385 390 395 400

Glu Leu Ile Ala Arg Asp Lys Asn His Pro Ser Val Val Met Trp Ser
405 410 415

Ile Ala Asn Glu Pro Asp Thr Arg Pro Gln Gly Ala Arg Glu Tyr Phe
420 425 430

Ala Pro Leu Ala Glu Ala Thr Arg Lys Leu Asp Pro Thr Arg Pro Ile
435 440 445

Thr Cys Val Asn Val Met Phe Cys Asp Ala His Thr Asp Thr Ile Ser
450 455 460

Asp Leu Phe Asp Val Leu Cys Leu Asn Arg Tyr Tyr Gly Trp Tyr Val
465 470 475 480

Gln Ser Gly Asp Leu Glu Thr Ala Glu Lys Val Leu Glu Lys Glu Leu
485 490 495

Leu Ala Trp Gln Glu Lys Leu His Gln Pro Ile Ile Ile Thr Glu Tyr
500 505 510

Gly Val Asp Thr Leu Ala Gly Leu His Ser Met Tyr Thr Asp Met Trp
515 520 525

Ser Glu Glu Tyr Gln Cys Ala Trp Leu Asp Met Tyr His Arg Val Phe
530 535 540

Asp Arg Val Ser Ala Val Val Gly Glu Gln Val Trp Asn Phe Ala Asp
545 550 555 560

Phe Ala Thr Ser Gln Gly Ile Leu Arg Val Gly Gly Asn Lys Lys Gly
565 570 575

Ile Phe Thr Arg Asp Arg Lys Pro Lys Ser Ala Ala Phe Leu Leu Gln
580 585 590

Lys Arg Trp Thr Gly Met Asn Phe Gly Glu Lys Pro Gln Gln Gly Gly
595 600 605

Lys Gln Gly Ser His His His His His His
610 615

<210> 29 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer to amplify GUS 

<400> 29 

cgcagcgtaa tgctctacac c 21 

<210> 30 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer to amplify GUS 

<400> 30 

ccgtaataac ggttcaggca c 21 

<210> 31 

<211> 45 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer to amplify the 2 um yeast replication origin and a Trp
selective marker 

<400> 31 

agggaacaaa agctggagct ccaccagagg gccaagaggg agggc 45 

<210> 32 

<211> 45 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer to amplify the 2 um yeast replication origin and a Trp
selective marker 

<400> 32 

cactagttct agagcggccg ccaccatatg atccaatatc aaagg 45 

<210> 33 

<211> 45 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer for PCR-directed recombination for in-frame fusion of
GUS-n/Int-n and Int-c/GUS-c

<400> 33 

ggatctcagt tccgaaagaa aggcagtctt gcgcgacatg cgtca 45



<210> 34 

<211> 45 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer for PCR-directed recombination for in-frame fusion of
GUS-n/Int-n and Int-c/GUS-c

<400> 34 

cccctcgagg tcgacggtat cgatatccat ggctcatcat catca 45 



<210> 35 

<211> 45 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer for PCR-directed recombination for in-frame fusion of
GUS-n/Int-n and Int-c/GUS-c

<400> 35 

gtccgtactc aacggtaagg atctcgtctt gcgcgacatg cgtca 45 



<210> 36 

<211> 45 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer for PCR-directed recombination for in-frame fusion of
GUS-n/Int-n and Int-c/GUS-c

<400> 36 

cgctaatggt gccatcgctg ccaattgtaa ccacgcgtct gttga 45 



<210> 37 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer for PCR-directed recombination for in-frame fusion of
GUS-n/Int-n and Int-c/GUS-c

<400> 37 

cgaggtcgac ggtatcgata ag 22 



<210> 38 

<211> 326 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> GUSn/Intn fusion 



<220> 

<221> PEPTIDE 
<222> (1)..(326)
<223> GUSn/Intn fusion

<400> 38 

Met Ala His His His His His His Val Arg Pro Val Glu Thr Pro Thr
1 5 10 15

Arg Glu Ile Lys Lys Leu Asp Gly Leu Trp Ala Phe Ser Leu Asp Arg
20 25 30

Glu Asn Cys Gly Ile Asp Gln Arg Trp Trp Glu Ser Ala Leu Gln Glu
35 40 45

Ser Arg Ala Ile Ala Val Pro Gly Ser Phe Asn Asp Gln Phe Ala Asp
50 55 60

Ala Asp Ile Arg Asn Tyr Ala Gly Asn Val Trp Tyr Gln Arg Glu Val
65 70 75 80

Phe Ile Pro Lys Gly Trp Ala Gly Gln Arg Ile Val Leu Arg Phe Asp
85 90 95

Ala Val Thr His Tyr Gly Lys Val Trp Val Asn Asn Gln Glu Val Met
100 105 110

Glu His Gln Gly Gly Tyr Thr Pro Phe Glu Ala Asp Val Thr Pro Tyr
115 120 125

Val Ile Ala Gly Lys Ser Val Arg Ile Thr Val Cys Val Asn Asn Glu
130 135 140

Leu Asn Trp Gln Thr Ile Pro Pro Gly Met Val Ile Thr Asp Glu Asn
145 150 155 160

Gly Lys Lys Lys Gln Ser Tyr Phe His Asp Phe Phe Asn Tyr Ala Gly
165 170 175

Ile His Arg Ser Val Met Leu Tyr Thr Thr Pro Asn Thr Trp Val Asp
180 185 190

Asp Ile Thr Val Val Thr His Val Ala Gln Asp Cys Leu Ser Phe Gly
195 200 205

Thr Glu Ile Leu Thr Val Glu Tyr Gly Pro Leu Pro Ile Gly Lys Ile
210 215 220

Val Ser Glu Glu Ile Asn Cys Ser Val Tyr Ser Val Asp Pro Glu Gly
225 230 235 240

Arg Val Tyr Thr Gln Ala Ile Ala Gln Trp His Asp Arg Gly Glu Gln
245 250 255

Glu Val Leu Glu Tyr Glu Leu Glu Asp Gly Ser Val Ile Arg Ala Thr
260 265 270

Ser Asp His Arg Phe Leu Thr Thr Asp Tyr Gln Leu Leu Ala Ile Glu
275 280 285

Glu Ile Phe Ala Arg Gln Leu Asp Leu Leu Thr Leu Glu Asn Ile Lys
290 295 300

Gln Thr Glu Glu Ala Leu Asp Asn His Arg Leu Pro Phe Pro Leu Leu
305 310 315 320

Asp Ala Gly Thr Ile Lys
325

<210> 39 
<211> 320 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> GUSn/Intn(6) fusion, with 6 amino acid deletion in Intn
<220> 

<221> PEPTIDE 
<222> (1)..(320) 

<223> GUSn/Intn(6) fusion, with 6 amino acid deletion in Intn
<400> 39 

Met Ala His His His His His His Val Arg Pro Val Glu Thr Pro Thr
1 5 10 15

Arg Glu Ile Lys Lys Leu Asp Gly Leu Trp Ala Phe Ser Leu Asp Arg
20 25 30

Glu Asn Cys Gly Ile Asp Gln Arg Trp Trp Glu Ser Ala Leu Gln Glu
35 40 45

Ser Arg Ala Ile Ala Val Pro Gly Ser Phe Asn Asp Gln Phe Ala Asp
50 55 60

Ala Asp Ile Arg Asn Tyr Ala Gly Asn Val Trp Tyr Gln Arg Glu Val
65 70 75 80

Phe Ile Pro Lys Gly Trp Ala Gly Gln Arg Ile Val Leu Arg Phe Asp
85 90 95

Ala Val Thr His Tyr Gly Lys Val Trp Val Asn Asn Gln Glu Val Met
100 105 110

Glu His Gln Gly Gly Tyr Thr Pro Phe Glu Ala Asp Val Thr Pro Tyr
115 120 125

Val Ile Ala Gly Lys Ser Val Arg Ile Thr Val Cys Val Asn Asn Glu
130 135 140

Leu Asn Trp Gln Thr Ile Pro Pro Gly Met Val Ile Thr Asp Glu Asn
145 150 155 160

Gly Lys Lys Lys Gln Ser Tyr Phe His Asp Phe Phe Asn Tyr Ala Gly
165 170 175

Ile His Arg Ser Val Met Leu Tyr Thr Thr Pro Asn Thr Trp Val Asp
180 185 190

Asp Ile Thr Val Val Thr His Val Ala Gln Asp Glu Ile Leu Thr Val
195 200 205

Glu Tyr Gly Pro Leu Pro Ile Gly Lys Ile Val Ser Glu Glu Ile Asn
210 215 220

Cys Ser Val Tyr Ser Val Asp Pro Glu Gly Arg Val Tyr Thr Gln Ala
225 230 235 240

Ile Ala Gln Trp His Asp Arg Gly Glu Gln Glu Val Leu Glu Tyr Glu
245 250 255

Leu Glu Asp Gly Ser Val Ile Arg Ala Thr Ser Asp His Arg Phe Leu
260 265 270

Thr Thr Asp Tyr Gln Leu Leu Ala Ile Glu Glu Ile Phe Ala Arg Gln
275 280 285

Leu Asp Leu Leu Thr Leu Glu Asn Ile Lys Gln Thr Glu Glu Ala Leu
290 295 300

Asp Asn His Arg Leu Pro Phe Pro Leu Leu Asp Ala Gly Thr Ile Lys
305 310 315 320

<210> 40 

<211> 450 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Intc/GUSc fusion
<220>

<221> PEPTIDE

<222> (1)..(450)

<223> Intc/GUSc fusion

<400> 40 

Met Val Lys Val Ile Gly Arg Arg Ser Leu Gly Val Gln Arg Ile Phe
1 5 10 15

Asp Ile Gly Leu Pro Gln Asp His Asn Phe Leu Leu Ala Asn Gly Ala
20 25 30

Ile Ala Ala Asn Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala
35 40 45

Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val
50 55 60

Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His
65 70 75 80

Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala
85 90 95

Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg
100 105 110

Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe
115 120 125

Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys
130 135 140

Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp
145 150 155 160

Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu
165 170 175

Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr
180 185 190

Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly
195 200 205

Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr
210 215 220

Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys
225 230 235 240

Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr
245 250 255

Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr
260 265 270

Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe
275 280 285

Cys Asp Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys
290 295 300

Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr
305 310 315 320

Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Trp Gln Glu Lys Leu His
325 330 335

Gln Pro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly Leu
340 345 350

His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala Trp
355 360 365

Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val Gly
370 375 380

Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile Leu
385 390 395 400

Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys Pro
405 410 415

Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn Phe
420 425 430

Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln Gly Ser His His His His
435 440 445

His His
450

<210> 41 

<211> 41 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer to amplify NOS terminator region 

<400> 41 

gcgtcgacag tcactctaga gacatcgatc tagtaacata g 41 

<210> 42 

<211> 41 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer to amplify NOS terminator region 

<400> 42 

ggggtacccc atgcggccgc ctaaagaagg agtgcgtcga a 41 



<210> 43 
<211> 18
<212> DNA

<213> Artificial Sequence
<220>

<223> 18 bp polylinker
<400> 43 

ggtacccgat ccaattcc 18 

<210> 44 

<211> 26 

<212> DNA 

<213> Artificial Sequence
<220> 

<223> Primer 

<400> 44 

gaccatggcc aatttactga ccgtac 26 

<210> 45 

<211> 27 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 45 

cgaaagaaag gcagcagcga tcgctat 27 

<210> 46 

<211> 29 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 46 

atagcgatcg ctgctgcctt tctttcgga 29 

<210> 47 

<211> 27 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 47 

atgtcgactc acttgatggt tccagca 27 

<210> 48 

<211> 3034 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence of 3034 bp Asp718 fragment containing
35S-CreN-IntN-3'ocs gene in plasmid pGV947
<400> 48

ggtacccgat ccaattccaa tcccacaaaa atctgagctt aacagcacag ttgctcctct 60
cagagcagaa tcgggtattc aacaccctca tatcaactac tacgttgtgt ataacggtcc 120
acatgccggt atatacgatg actggggttg tacaaaggcg gcaacaaacg gcgttcccgg 180
agttgcacac aagaaatttg ccactattac agaggcaaga gcagcagctg acgcgtacac 240
aacaagtcag caaacagaca ggttgaactt catccccaaa ggagaagctc aactcaagcc 300
caagagcttt gctaaggccc taacaagccc accaaagcaa aaagcccact ggctcacgct 360
aggaaccaaa aggcccagca gtgatccagc cccaaaagag atctcctttg ccccggagat 420
tacaatggac gatttcctct atctttacga tctaggaagg aagttcgaag gtgaaggtga 480
cgacactatg ttcaccactg ataatgagaa ggttagcctc ttcaatttca gaaagaatgc 540
tgacccacag atggttagag aggcctacgc agcaggtctc atcaagacga tctacccgag 600
taacaatctc caggagatca aataccttcc caagaaggtt aaagatgcag tcaaaagatt 660
caggactaat tgcatcaaga acacagagaa agacatattt ctcaagatca gaagtactat 720
tccagtatgg acgattcaag gcttgcttca taaaccaagg caagtaatag agattggagt 780
ctctaaaaag gtagttccta ctgaatctaa ggccatgcat ggagtctaag attcaaatcg 840
aggatctaac agaactcgcc gtgaagactg gcgaacagtt catacagagt cttttacgac 900
tcaatgacaa gaagaaaatc ttcgtcaaca tggtggagca cgacactctg gtctactcca 960
aaaatgtcaa agatacagtc tcagaagacc aaagggctat tgagactttt caacaaagga 1020
taatttcggg aaacctcctc ggattccatt gcccagctat ctgtcacttc atcgaaagga 1080
cagtagaaaa ggaaggtggc tcctacaaat gccatcattg cgataaagga aaggctatca 1140
ttcaagatgc ctctgccgac agtggtccca aagatggacc cccacccacg aggagcatcg 1200
tggaaaaaga agacgttcca accacgtctt caaagcaagt ggattgatgt gacatctcca 1260
ctgacgtaag ggatgacgca caatcccact atccttcgca agacccttcc tctatataag 1320
gaagttcatt tcatttggag aggacacgct cgagctcatt tctctattac ttcagccata 1380
acaaaagaac tcttttctct tcttattaaa ccatggccaa tttactgacc gtacaccaaa 1440
atttgcctgc attaccggtc gatgcaacga gtgatgaggt tcgcaagaac ctgatggaca 1500
tgttcaggga tcgccaggcg ttttctgagc atacctggaa aatgcttctg tccgtttgcc 1560
ggtcgtgggc ggcatggtgc aagttgaata accggaaatg gtttcccgca gaacctgaag 1620
atgttcgcga ttatcttcta tatcttcagg cgcgcggtct ggcagtaaaa actatccagc 1680
aacatttggg ccagctaaac atgcttcatc gtcggtccgg gctgccacga ccaagtgaca 1740
gcaatgctgt ttcactagtt atgcggcgga tccgaaaaga aaacgttgat gccggtgaac 1800
gtgcaaaaca ggctctagcg ttcgaacgca ctgatttcga ccaggttcgt tcactcatgg 1860
aaaatagcga tcgctgctgc ctttctttcg gaactgagat ccttaccgtt gagtacggac 1920
cacttcctat tggtaagatc gtttctgagg aaattaactg ctcagtgtac tctgttgatc 1980
cagaaggaag agtttacact caggctatcg cacaatggca cgataggggt gaacaagagg 2040
ttctcgagta cgagcttgaa gatggatccg ttattcgtgc tacctctgac catagattct 2100
tgactacaga ttatcagctt ctcgctatcg aggaaatctt tgctaggcaa cttgatctcc 2160
ttactttgga gaacatcaag cagacagaag aggctcttga caaccacaga cttccattcc 2220
ctttgctcga tgctggaacc atcaagtgag tcgacataat cactagagtc ctgctttaat 2280
gagatatgcg agacgcctat gatcgcatga tatttgcttt caattctgtt gtgcacgttg 2340
taaaaaacct gagcatgtgt agctcagatc cttaccgccg gtttcggttc attctaatga 2400
atatatcacc cgttactatc gtatttttat gaataatatt ctccgttcaa tttactgatt 2460
gtaccctact acttatatgt acaatattaa aatgaaaaca atatattgtg ctgaataggt 2520
ttatagcgac atctatgata gagcgccaca ataacaaaca attgcgtttt attattacaa 2580
atccaatttt aaaaaaagcg gcagaaccgg tcaaacctaa aagactgatt acataaatct 2640
tattcaaatt tcaaaaggcc ccaggggcta gtatctacga cacaccgagc ggcgaactaa 2700
taacgttcac tgaagggaac tccggttccc cgccggcgcg catgggtgag attccttgaa 2760


gttgagtatt ggccgtccgc tctaccgaaa gttacgggca ccattcaacc cggtccagca 2820


cggcggccgg gtaaccgact tgctgccccg agaattatgc agcatttttt tggtgtatgt 2880
gggccccaaa tgaagtgcag gtcaaacctt gacagtgacg acaaatcgtt gggcgggtcc 2940
agggcgaatt ttgcgacaac atgtcgaggc tcagcaggac ctgcaggcat gcaagcttat 3000
cgataccgtc gacctcgagg gggggcccgg tacc 3034



<210> 49 

<211> 17 

<212> DNA 

<213> Artificial Sequence
<220> 

<223> 17 bp linker sequence 

<400> 49 

gtcgacataa tcactag 17 



<210> 50 

<211> 60 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> 60 bp polylinker

<400> 50 

caggacctgc aggcatgcaa gcttatcgat accgtcgacc tcgagggggg gcccggtacc 60 



<210> 51 

<211> 27 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 51

gaccatggtt aaggtgattg gaagacg 27



<210> 52 

<211> 30 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 

<400> 52 

tacgtatatc ctggcaattg gcagcgatgg 30

<210> 53

<211> 31

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 53

cgctgccaat tgccaggata tacgtaatct g 31



<210> 54 

<211> 28 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 54 

agtcgaccta atcgccatct tccagcag 28 



<210> 55 

<211> 2873 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Sequence of 2873 bp Asp718 fragment containing 
35S:intC-CreC:3'ocs in plasmid pGV951 

<400> 55 

ggtacccgat ccaattccaa tcccacaaaa atctgagctt aacagcacag ttgctcctct 60 

cagagcagaa tcgggtattc aacaccctca tatcaactac tacgttgtgt ataacggtcc 120 

acatgccggt atatacgatg actggggttg tacaaaggcg gcaacaaacg gcgttcccgg 180 

agttgcacac aagaaatttg ccactattac agaggcaaga gcagcagctg acgcgtacac 240 

aacaagtcag caaacagaca ggttgaactt catccccaaa ggagaagctc aactcaagcc 300 

caagagcttt gctaaggccc taacaagccc accaaagcaa aaagcccact ggctcacgct 360 

aggaaccaaa aggcccagca gtgatccagc cccaaaagag atctcctttg ccccggagat 420 

tacaatggac gatttcctct atctttacga tctaggaagg aagttcgaag gtgaaggtga 480 

cgacactatg ttcaccactg ataatgagaa ggttagcctc ttcaatttca gaaagaatgc 540 

tgacccacag atggttagag aggcctacgc agcaggtctc atcaagacga tctacccgag 600

taacaatctc caggagatca aataccttcc caagaaggtt aaagatgcag tcaaaagatt 660

caggactaat tgcatcaaga acacagagaa agacatattt ctcaagatca gaagtactat 720

tccagtatgg acgattcaag gcttgcttca taaaccaagg caagtaatag agattggagt 780

ctctaaaaag gtagttccta ctgaatctaa ggccatgcat ggagtctaag attcaaatcg 840

aggatctaac agaactcgcc gtgaagactg gcgaacagtt catacagagt cttttacgac 900

tcaatgacaa gaagaaaatc ttcgtcaaca tggtggagca cgacactctg gtctactcca 960

aaaatgtcaa agatacagtc tcagaagacc aaagggctat tgagactttt caacaaagga 1020

taatttcggg aaacctcctc ggattccatt gcccagctat ctgtcacttc atcgaaagga 1080

cagtagaaaa ggaaggtggc tcctacaaat gccatcattg cgataaagga aaggctatca 1140

ttcaagatgc ctctgccgac agtggtccca aagatggacc cccacccacg aggagcatcg 1200

tggaaaaaga agacgttcca accacgtctt caaagcaagt ggattgatgt gacatctcca 1260

ctgacgtaag ggatgacgca caatcccact atccttcgca agacccttcc tctatataag 1320

gaagttcatt tcatttggag aggacacgct cgagctcatt tctctattac ttcagccata 1380

acaaaagaac tcttttctct tcttattaaa ccatggttaa ggtgattgga agacgttctc 1440

ttggtgttca aaggatcttc gatatcggat tgccacaaga ccacaacttt cttctcgcta 1500

atggtgccat cgctgccaat tgccaggata tacgtaatct ggcatttctg gggattgctt 1560

ataacaccct gttacgtata gccgaaattg ccaggatcag ggttaaagat atctcacgta 1620

ctgacggtgg gagaatgtta atccatattg gcagaacgaa aacgctggtt agcaccgcag 1680

gtgtagagaa ggcacttagc ctgggggtaa ctaaactggt cgagcgatgg atttccgtct 1740

ctggtgtagc tgatgatccg aataactacc tgttttgccg ggtcagaaaa aatggtgttg 1800

ccgcgccatc tgccaccagc cagctatcaa ctcgcgccct ggaagggatt tttgaagcaa 1860

ctcatcgatt gatttacggc gctaaggatg actctggtca gagatacctg gcctggtctg 1920

gacacagtgc ccgtgtcgga gccgcgcgag atatggcccg cgctggagtt tcaataccgg 1980

agatcatgca agctggtggc tggaccaatg taaatattgt catgaactat atccgtaacc 2040

tggatagtga aacaggggca atggtgcgcc tgctggaaga tggcgattag gtcgactatc 2100

actagagtcc tgctttaatg agatatgcga gacgcctatg atcgcatgat atttgctttc 2160

aattctgttg tgcacgttgt aaaaaacctg agcatgtgta gctcagatcc ttaccgccgg 2220

tttcggttca ttctaatgaa tatatcaccc gttactatcg tatttttatg aataatattc 2280

tccgttcaat ttactgattg taccctacta cttatatgta caatattaaa atgaaaacaa 2340

tatattgtgc tgaataggtt tatagcgaca tctatgatag agcgccacaa taacaaacaa 2400

ttgcgtttta ttattacaaa tccaatttta aaaaaagcgg cagaaccggt caaacctaaa 2460

agactgatta cataaatctt attcaaattt caaaaggccc caggggctag tatctacgac 2520

acaccgagcg gcgaactaat aacgttcact gaagggaact ccggttcccc gccggcgcgc 2580

atgggtgaga ttccttgaag ttgagtattg gccgtccgct ctaccgaaag ttacgggcac 2640

cattcaaccc ggtccagcac ggcggccggg taaccgactt gctgccccga gaattatgca 2700

gcattttttt ggtgtatgtg ggccccaaat gaagtgcagg tcaaaccttg acagtgacga 2760

caaatcgttg ggcgggtcca gggcgaattt tgcgacaaca tgtcgaggct cagcaggacc 2820

tgcaggcatg caagcttatc gataccgtcg acctcgaggg ggggcccggt acc 2873



<210> 56 

<211> 14 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> 15 bp linker 

<400> 56 

tcgactatca ctag 14 

<210> 57 

<211> 5449 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> sequence of 5449 bp SalI-HindIII fragment containing the blocked
GUS reporter gene for Cre-Lox excision in plasmid pGV801



<400> 57 
gtcgactcta gaggatccaa ttccaatccc acaaaaatct gagcttaaca gcacagttgc 60

tcctctcaga gcagaatcgg gtattcaaca ccctcatatc aactactacg ttgtgtataa 120

cggtccacat gccggtatat acgatgactg gggttgtaca aaggcggcaa caaacggcgt 180

tcccggagtt gcacacaaga aatttgccac tattacagag gcaagagcag cagctgacgc 240

gtacacaaca agtcagcaaa cagacaggtt gaacttcatc cccaaaggag aagctcaact 300

caagcccaag agctttgcta aggccctaac aagcccacca aagcaaaaag cccactggct 360

cacgctagga accaaaaggc ccagcagtga tccagcccca aaagagatct cctttgcccc 420

ggagattaca atggacgatt tcctctatct ttacgatcta ggaaggaagt tcgaaggtga 480

aggtgacgac actatgttca ccactgataa tgagaaggtt agcctcttca atttcagaaa 540

gaatgctgac ccacagatgg ttagagaggc ctacgcagca ggtctcatca agacgatcta 600

cccgagtaac aatctccagg agatcaaata ccttcccaag aaggttaaag atgcagtcaa 660

aagattcagg actaattgca tcaagaacac agagaaagac atatttctca agatcagaag 720

tactattcca gtatggacga ttcaaggctt gcttcataaa ccaaggcaag taatagagat 780

tggagtctct aaaaaggtag ttcctactga atctaaggcc atgcatggag tctaagattc 840

aaatcgagga tctaacagaa ctcgccgtga agactggcga acagttcata cagagtcttt 900

tacgactcaa tgacaagaag aaaatcttcg tcaacatggt ggagcacgac actctggtct 960

actccaaaaa tgtcaaagat acagtctcag aagaccaaag ggctattgag acttttcaac 1020


aaaggataat ttcgggaaac ctcctcggat tccattgccc agctatctgt

aaaggacagt agaaaaggaa ggtggctcct acaaatgcca tcattgcgat aaaggaaagg 1140

ctatcattca agatgcctct gccgacagtg gtcccaaaga tggaccccca cccacgagga 1200

gcatcgtgga aaaagaagac gttccaacca cgtcttcaaa gcaagtggat tgatgtgaca 1260

tctccactga cgtaagggat gacgcacaat cccactatcc ttcgcaagac ccttcctcta 1320

tataaggaag ttcatttcat ttggagagga cacgctcgag ctcatttctc tattacttca 1380

gccataacaa aagaactctt ttctcttctt attaaaccat gataacttcg tatagcatac 1440

attatacgaa gttatcctag gatcatgagc ggagaattaa gggagtcacg ttatgacccc 1500

cgccgatgac gcgggacaag ccgttttacg tttggaactg acagaaccgc aacgttgaag 1560

gagccactca gccgcgggtt tctggagttt aatgagctaa gcacatacgt cagaaaccat 1620

tattgcgcgt tcaaaagtcg cctaaggtca ctatcagcta gcaaatattt cttgtcaaaa 1680

atgctccact gacgttccat aaattcccct cggtatccaa ttagagtctc atattcactc 1740

tcaatccaaa taatctgcac cggatctgga tcgtttcgca tgattgaaca agatggattg 1800

cacgcaggtt ctccggccgc ttgggtggag aggctattcg gctatgactg ggcacaacag 1860

acaatcggct gctctgatgc cgccgtgttc cggctgtcag cgcaggggcg cccggttctt 1920

tttgtcaaga ccgacctgtc cggtgccctg aatgaactgc aggacgaggc agcgcggcta 1980

tcgtggctgg ccacgacggg cgttccttgc gcagctgtgc tcgacgttgt cactgaagcg 2040

ggaagggact ggctgctatt gggcgaagtg ccggggcagg atctcctgtc atctcacctt 2100

gctcctgccg agaaagtatc catcatggct gatgcaatgc ggcggctgca tacgcttgat 2160

ccggctacct gcccattcga ccaccaagcg aaacatcgca tcgagcgagc acgtactcgg 2220


atggaagccg gtcttgtcga tcaggatgat ctggacgaag agcatcaggg gctcgcgcca 2280

gccgaactgt tcgccaggct caaggcgcgc atgcccgacg gcgatgatct cgtcgtgacc 2340

catggcgatg cctgcttgcc gaatatcatg gtggaaaatg gccgcttttc tggattcatc 2400

gactgtggcc ggctgggtgt ggcggaccgc tatcaggaca tagcgttggc tacccgtgat 2460

attgctgaag agcttggcgg cgaatgggct gaccgcttcc tcgtgcttta cggtatcgcc 2520

gctcccgatt cgcagcgcat cgccttctat cgccttcttg acgagttctt ctgagcggga 2580

ctctggggtt cgaaatgacc gaccaagcga cgcccaacct gccatcacga gatttcgatt 2640

ccaccgccgc cttctatgaa aggttgggct tcggaatcgt tttccgggac gccggctgga 2700

tgatcctcca gcgcggggat ctcatgctgg agttcttcgc ccacgggatc tctgcggaac 2760

aggcggtcga aggtgccgat atcattacga cagcaacggc cgacaagcac aacgccacga 2820

tcctgagcga caatatgatc gggcccggcg tccacatcaa cggcgtcggc ggcgactgcc 2880

caggcaagac cgagatgcac cgcgatatct tgctgcgttc ggatattttc gtggagttcc 2940

cgccacagac ccggatgatc cccgatcgtt caaacatttg gcaataaagt ttcttaagat 3000

tgaatcctgt tgccggtctt gcgatgatta tcatataatt tctgttgaat tacgttaagc 3060

atgtaataat taacatgtaa tgcatgacgt tatttatgag atgggttttt atgattagag 3120

tcccgcaatt atacatttaa tacgcgatag aaaacaaaat atagcgcgca aactaggata 3180


aattatcgcg cgcggtgtca tctatgttac tagatcgggc ctcctgtcaa tgctggccta 3240

ggtaaataac ttcgtatagc atacattata cgaagttatt agtacgtcct gtagaaaccc 3300

caacccgtga aatcaaaaaa ctcgacggcc tgtgggcatt cagtctggat cgcgaaaact 3360

gtggaattga tcagcgttgg tgggaaagcg cgttacaaga aagccgggca attgctgtgc 3420

caggcagttt taacgatcag ttcgccgatg cagatattcg taattatgcg ggcaacgtct 3480

ggtatcagcg cgaagtcttt ataccgaaag gttgggcagg ccagcgtatc gtgctgcgtt 3540

tcgatgcggt cactcattac ggcaaagtgt gggtcaataa tcaggaagtg atggagcatc 3600

agggcggcta tacgccattt gaagccgatg tcacgccgta tgttattgcc gggaaaagtg 3660

tacgtatcac cgtttgtgtg aacaacgaac tgaactggca gactatcccg ccgggaatgg 3720

tgattaccga cgaaaacggc aagaaaaagc agtcttactt ccatgatttc tttaactatg 3780

ccggaatcca tcgcagcgta atgctctaca ccacgccgaa cacctgggtg gacgatatca 3840

ccgtggtgac gcatgtcgcg caagactgta accacgcgtc tgttgactgg caggtggtgg 3900

ccaatggtga tgtcagcgtt gaactgcgtg atgcggatca acaggtggtt gcaactggac 3960

aaggcactag cgggactttg caagtggtga atccgcacct ctggcaaccg ggtgaaggtt 4020


atctctatga actgtgcgtc acagccaaaa gccagacaga gtgtgatatc tacccgcttc 4080

gcgtcggcat ccggtcagtg gcagtgaagg gccaacagtt cctgattaac cacaaaccgt 4140

tctactttac tggctttggt cgtcatgaag atgcggactt acgtggcaaa ggattcgata 4200

acgtgctgat ggtgcacgac cacgcattaa tggactggat tggggccaac tcctaccgta 4260

cctcgcatta cccttacgct gaagagatgc tcgactgggc agatgaacat ggcatcgtgg 4320

tgattgatga aactgctgct gtcggcttta acctctcttt aggcattggt ttcgaagcgg 4380

gcaacaagcc gaaagaactg tacagcgaag aggcagtcaa cggggaaact cagcaagcgc 4440

acttacaggc gattaaagag ctgatagcgc gtgacaaaaa ccacccaagc gtggtgatgt 4500

ggagtattgc caacgaaccg gatacccgtc cgcaagtgca cgggaatatt tcgccactgg 4560

cggaagcaac gcgtaaactc gacccgacgc gtccgatcac ctgcgtcaat gtaatgttct 4620

gcgacgctca caccgatacc atcagcgatc tctttgatgt gctgtgcctg aaccgttatt 4680

acggatggta tgtccaaagc ggcgatttgg aaacggcaga gaaggtactg gaaaaagaac 4740

ttctggcctg gcaggagaaa ctgcatcagc cgattatcat caccgaatac ggcgtggata 4800

cgttagccgg gctgcactca atgtacaccg acatgtggag tgaagagtat cagtgtgcat 4860


y y v_ l y y a. lci l 




ntrtttnatr 


y LLay v_yv_ 


*-y L< -y L *-yy L 


naaranntat 
yaacayy Let l 




ggaatttcgc cgattttgcg acctcgcaag gcatattgcg cgttggcggt aacaagaaag 4980

ggatcttcac tcgcgaccgc aaaccgaagt cggcggcttt tctgctgcaa aaacgctgga 5040

ctggcatgaa cttcggtgaa aaaccgcagc agggaggcaa acaatgaatc aacaactctc 5100

ctggcgcacc atcgtcggct acagcctcgg tggggaattc cccgggggta cctaaagaag 5160

gagtgcgtcg aagcagatcg ttcaaacatt tggcaataaa gtttcttaag attgaatcct 5220

gttgccggtc ttgcgatgat tatcatataa tttctgttga attacgttaa gcatgtaata 5280 

attaacatgt aatgcatgac gttatttatg agatgggttt ttatgattag agtcccgcaa 5340 

ttatacattt aatacgcgat agaaaacaaa atatagcgcg caaactagga taaattatcg 5400 

cgcgcggtgt catctatgtt actagatcga tgtcgactct agaaagctt 5449 



<210> 58 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> 24 bp polylinker

<400> 58 

gtcgactcta gaggatccaa ttcc 24 



<210> 59 

<211> 34 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Lox P sequence 
<400> 59 

ataacttcgt atagcataca ttatacgaag ttat 34



<210> 60 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> 22 bp polylinker 

<400> 60 

tggggaattc cccgggggta cc 22 



<210> 61 

<211> 18 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> 18 bp polylinker 

<400> 61 

gtcgactcta gaaagctt 18 



<210> 62 
<211> 605 
<212> PRT 

<213> Artificial Sequence
<220>

<223> Elastin-based protein polymer (Zhang et al., Plant Cell Rep.
16(3-4): 174-179 (1996))
<400> 62

Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
1 5 10 15

Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
20 25 30

Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
35 40 45

Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
50 55 60

Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
65 70 75 80

Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
85 90 95

Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
100 105 110

Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
115 120 125

Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
130 135 140

Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
145 150 155 160

Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
165 170 175

Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
180 185 190

Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
195 200 205

Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
210 215 220

Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
225 230 235 240

Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
245 250 255

Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
260 265 270

Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
275 280 285

Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
290 295 300

Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
305 310 315 320

Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
325 330 335

Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
340 345 350

Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
355 360 365

Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
370 375 380

Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
385 390 395 400

Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
405 410 415

Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
420 425 430

Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
435 440 445

Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
450 455 460

Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
465 470 475 480

Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
485 490 495

Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
500 505 510

Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly
515 520 525

Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val
530 535 540

Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
545 550 555 560

Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly
565 570 575

Val Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro Gly Val
580 585 590

Gly Val Pro Gly Val Gly Val Pro Gly Val Gly Val Pro
595 600 605 

<210> 63 

<211> 8 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Coding sequence introduced by oligomer HGUSH-n 

<400> 63 

Met Ala His His His His His His 
1 5 

<210> 64 

<211> 13 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Insertion sequence in pGY101 (a pBluescript-based plasmid)

<400> 64 

Met Ala Arg Ser Arg Gly Ser His His His His His His
1 5 10 

<210> 65 

<211> 6 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Residues deleted from intN to create GUSn/intn(6) fusion 

<400> 65 

Cys Leu Ser Phe Gly Thr
1 5 

<210> 66 

<211> 12 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> 12-amino acid N-terminal amino acid extension to GUS ORF 
<400> 66

Ile Thr Ser Tyr Ser Ile His Tyr Thr Lys Leu Leu
1 5 10



<210> 67 

<211> 21 

<212> DNA 

<213> artificial sequence 
<220> 

<223> primer 

<400> 67 

gtatcaccgt ttgtgtgaac a 



<210> 68 

<211> 32 

<212> DNA 

<213> artificial sequence 
<220> 

<223> primer 

<400> 68 

ggaattcctc agatgttctc caaagtaagg ag 



<210> 69 

<211> 32 

<212> DNA 

<213> artificial sequence 
<220> 

<223> primer 

<400> 69 

ggaattcctc aatctgtagt caagaatcta tg 



<210> 70 

<211> 32 

<212> DNA 

<213> artificial sequence 
<220> 

<223> primer 

<400> 70 

ggaattcctc aaacctcttg ttcaccccta tc 



<210> 71 
<211> 78 
<212> PRT 

<213> Synechocystis sp. PCC6803 
<400> 71 

Cys Leu Ser Phe Gly Thr Glu Ile Leu Thr Val Glu Tyr Gly Pro Leu
1 5 10 15

Pro Ile Gly Lys Ile Val Ser Glu Glu Ile Asn Cys Ser Val Tyr Ser
20 25 30

Val Asp Pro Glu Gly Arg Val Tyr Thr Gln Ala Ile Ala Gln Trp His
35 40 45

Asp Arg Gly Glu Gln Glu Val Leu Glu Tyr Glu Leu Glu Asp Gly Ser
50 55 60

Val Ile Arg Ala Thr Ser Asp His Arg Phe Leu Thr Thr Asp
65 70 75

<210> 72 
<211> 234 
<212> DNA 

<213> artificial sequence 
<220> 

<223> plant codon optimized IntN encoding sequence 
<400> 72 

tgcctttctt tcggaactga gatccttacc gttgagtacg gaccacttcc tattggtaag 60 
atcgtttctg aggaaattaa ctgctcagtg tactctgttg atccagaagg aagagtttac 120 
actcaggcta tcgcacaatg gcacgatagg ggtgaacaag aggttctcga gtacgagctt 180 
gaagatggat ccgttattcg tgctacctct gaccatagat tcttgactac agat 234 

<210> 73 
<211> 12 
<212> PRT 

<213> Synechocystis sp. PCC6803 
<400> 73 

Cys Leu Ser Phe Gly Thr Glu Ile Leu Thr Val Glu
1 5 10

<210> 74 
<211> 15 
<212> PRT 

<213> Synechocystis sp. PCC6803 
<400> 74 

Asp Gly Ser Val Ile Arg Ala Thr Ser Asp His Arg Phe Leu Thr
1 5 10 15

<210> 75 
<211> 13 
<212> PRT 

<213> Trichodesmium erythraeum 
<400> 75 

Cys Leu Thr Tyr Glu Thr Glu Ile Met Thr Val Glu Tyr
1 5 10



<210> 76 
<211> 15 
<212> PRT 
<213> Trichodesmium erythraeum

<400> 76 

Asp Gly Thr Val Ile Arg Ala Thr Pro Glu His Lys Phe Met Thr
1 5 10 15 



<210> 77 

<211> 12 

<212> PRT 

<213> artificial 

<220> 

<223> compiled sequence 



<220> 

<221> misc_feature
<222> (3)..(5)

<223> Xaa can be any naturally occurring amino acid
<220>

<221> misc_feature
<222> (9)..(9)

<223> Xaa can be any naturally occurring amino acid
<400> 77

Cys Leu Xaa Xaa Xaa Thr Glu Ile Xaa Thr Val Glu
1 5 10



<210> 78 

<211> 15 

<212> PRT 

<213> artificial 

<220> 

<223> compiled sequence 



<220> 

<221> misc_feature

<222> (3)..(3)

<223> Xaa can be any naturally occurring amino acid
<220>

<221> misc_feature

<222> (9)..(10)

<223> Xaa can be any naturally occurring amino acid
<220>

<221> misc_feature

<222> (12)..(12)

<223> Xaa can be any naturally occurring amino acid
<220>

<221> misc_feature

<222> (14)..(14)

<223> Xaa can be any naturally occurring amino acid

<400> 78

Asp Gly Xaa Val Ile Arg Ala Thr Xaa Xaa His Xaa Phe Xaa Thr
1 5 10 15 






